Large-scale identification of human protein function using topological features of interaction network
Authors
Abstract
Annotating protein function is a vital step toward understanding life at the molecular level, and it is also valuable to the biomedical and pharmaceutical industries. Advances in sequencing technology keep widening the gap between the number of known sequences and the number with known functions, so computational methods for annotating protein function are indispensable. Here, a novel method is proposed to identify protein function based on the weighted human protein-protein interaction network and graph theory. Network topology features carrying both local and global information are used to characterise proteins. The minimum redundancy maximum relevance algorithm is used to select 227 optimized feature subsets, and support vector machines are used to build the prediction models. The performance of the method is assessed by 10-fold cross-validation, with accuracies ranging from 67.63% to 100%. Compared with other annotation methods, the proposed method achieves a 50% improvement in predictive accuracy. More generally, such network topology features provide insights into the relationship between protein functions and network architectures. The Matlab source code is freely available on request from the authors.
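As a rough illustration of what "topology features with local and global information" can look like, the sketch below computes three per-protein features on a toy weighted network. The abstract does not list the features the authors used, so the choice here (weighted degree and clustering coefficient as local features, closeness as a global one), the edge length of 1 / weight, and the protein names and weights are all assumptions made for illustration. This is not the authors' Matlab implementation.

```python
import heapq

# Toy weighted interaction network: hypothetical proteins and confidence scores.
EDGES = [
    ("A", "B", 0.9),
    ("A", "C", 0.8),
    ("B", "C", 0.7),
    ("C", "D", 0.5),
    ("D", "E", 0.4),
]


def build_graph(edges):
    """Return an undirected adjacency map: node -> {neighbour: weight}."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def weighted_degree(graph, node):
    """Local feature: sum of the weights of edges incident to the node."""
    return sum(graph[node].values())


def clustering_coefficient(graph, node):
    """Local feature: fraction of neighbour pairs that are themselves linked."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in graph[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def shortest_distances(graph, source):
    """Dijkstra with edge length 1 / weight (assumed), so strong links are short."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + 1.0 / w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def closeness(graph, node):
    """Global feature: reachable nodes divided by the sum of distances to them."""
    dist = shortest_distances(graph, node)
    total = sum(d for n, d in dist.items() if n != node)
    return (len(dist) - 1) / total if total > 0 else 0.0


def feature_vector(graph, node):
    """Concatenate the local and global features for one protein."""
    return [
        weighted_degree(graph, node),
        clustering_coefficient(graph, node),
        closeness(graph, node),
    ]


graph = build_graph(EDGES)
features = {node: feature_vector(graph, node) for node in sorted(graph)}
```

In a pipeline like the one the abstract outlines, vectors of this kind would then go through feature selection and into a classifier evaluated by cross-validation. Those stages are omitted here.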
Similar resources
Topological Characterization of Protein-Protein Interaction Networks in Human and Mouse
The elucidation of the cell's large-scale organization is a primary challenge for post-genomic biology, and understanding the structure and topological properties of protein-protein interaction networks offers an important starting point for such studies. We compare the protein-protein interaction network of the human and mouse, aiming to uncover the network's generic large-scale properties and...
Scale-space measures for graph topology link protein network architecture to function
MOTIVATION The network architecture of physical protein interactions is an important determinant for the molecular functions that are carried out within each cell. To study this relation, the network architecture can be characterized by graph topological characteristics such as shortest paths and network hubs. These characteristics have an important shortcoming: they do not take into account th...
Construction and Analysis of Tissue-Specific Protein-Protein Interaction Networks in Humans
We have studied the changes in the protein-protein interaction network across 38 different tissues of the human body. 123 gene expression samples from these tissues were used to construct a human protein-protein interaction network. This network is then pruned using the gene expression samples of each tissue to construct different protein-protein interaction networks corresponding to different studied ti...
Structure discovery in PPI networks using pattern-based network decomposition
MOTIVATION The large, complex networks of interactions between proteins provide a lens through which one can examine the structure and function of biological systems. Previous analyses of these continually growing networks have primarily followed either of two approaches: large-scale statistical analysis of holistic network properties, or small-scale analysis of local topological features. Mean...
Link Prediction using Network Embedding based on Global Similarity
Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. Local link prediction approaches with node structure objectives are fast but not accurate enough. On the other hand, the global link predicti...